Automated Ontological Gene Annotation for Computing Disease Similarity

Authors

  • Sachin Mathur
  • Deendayal Dinakarpandian
Abstract

The annotation of gene/gene products with information on associated diseases is useful as an aid to clinical diagnosis and drug discovery. Several supervised and unsupervised methods exist that automate the association of genes with diseases, but relatively little work has been done to map protein sequence data to disease terminologies. This paper augments an existing open-disease terminology, the Disease Ontology (DO), and uses it for automated annotation of Swissprot records. In addition to the inherent benefits of mapping data to a rich ontology, we demonstrate a gain of 36.1% in gene-disease associations compared to that in DO. Further, we measure disease similarity by exploiting the co-occurrence of annotation among proteins and the hierarchical structure of DO. This makes it possible to find related diseases or signs, with the potential to find previously unknown relationships.
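The abstract does not state the exact similarity measure, so the following is only a minimal sketch of the two signals it names: co-occurrence of disease annotations among proteins, and position in the ontology hierarchy. The Jaccard formulation, the toy annotations, the single-parent hierarchy, and all function names are assumptions made for illustration; they are not taken from the paper or from DO itself.

```python
from typing import Dict, Set

# Hypothetical toy data: protein identifier -> disease terms annotated to it.
ANNOTATIONS: Dict[str, Set[str]] = {
    "P1": {"diabetes mellitus", "obesity"},
    "P2": {"diabetes mellitus", "obesity", "hypertension"},
    "P3": {"hypertension"},
    "P4": {"diabetes mellitus"},
}

# Hypothetical is-a edges (child -> parent) standing in for a small ontology
# fragment. Assumption: one parent per term; a real ontology may be a DAG.
PARENT: Dict[str, str] = {
    "diabetes mellitus": "metabolic disease",
    "obesity": "metabolic disease",
    "hypertension": "vascular disease",
    "metabolic disease": "disease",
    "vascular disease": "disease",
}


def proteins_for(term: str) -> Set[str]:
    """Return the proteins annotated with the given disease term."""
    return {p for p, terms in ANNOTATIONS.items() if term in terms}


def cooccurrence_similarity(a: str, b: str) -> float:
    """Jaccard overlap of the protein sets annotated with terms a and b."""
    pa, pb = proteins_for(a), proteins_for(b)
    union = pa | pb
    return len(pa & pb) / len(union) if union else 0.0


def ancestors(term: str) -> Set[str]:
    """Return the term together with all of its ancestors up to the root."""
    found = {term}
    while term in PARENT:
        term = PARENT[term]
        found.add(term)
    return found


def hierarchy_similarity(a: str, b: str) -> float:
    """Jaccard overlap of the ancestor sets of terms a and b."""
    aa, ab = ancestors(a), ancestors(b)
    return len(aa & ab) / len(aa | ab)


if __name__ == "__main__":
    for pair in [("diabetes mellitus", "obesity"),
                 ("diabetes mellitus", "hypertension")]:
        print(pair,
              round(cooccurrence_similarity(*pair), 3),
              round(hierarchy_similarity(*pair), 3))
```

In this toy data, terms that share annotated proteins and a close common ancestor score higher on both measures than terms that meet only at the root. How the two signals might be combined is left open here, since the abstract does not say.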


Similar resources

GandrKB--ontological microarray annotation and visualization

SUMMARY The Gandr (gene annotation data representation) knowledgebase is an ontological framework for laboratory-specific gene annotation. Gandr uses Protege 2000 for editing, querying and visualizing microarray data and annotations. Genes can be annotated with provided, newly created or imported ontological concepts. Annotated genes can inherit assigned concept properties and can be related to...


EST-PAC HPC – a web portal for high-throughput EST annotation and protein sequence prediction

Expressed Sequence Tags (ESTs) are short DNA sequences generated by sequencing the transcribed cDNAs coming from a gene expression. They can provide significant functional, structural and evolutionary information and thus are a primary resource for gene discovery. EST annotation basically refers to the analysis of unknown ESTs that can be performed by database similarity search for possible ide...


Investigating Semantic Similarity Measures Across the Gene Ontology: The Relationship Between Sequence and Annotation

MOTIVATION Many bioinformatics data resources not only hold data in the form of sequences, but also as annotation. In the majority of cases, annotation is written as scientific natural language: this is suitable for humans, but not particularly useful for machine processing. Ontologies offer a mechanism by which knowledge can be represented in a form capable of such processing. In this paper we...


Semantic Annotation for Web Service Processes in Pervasive Computing

In this chapter, we propose a new approach to the discovery, the selection, and the automated composition of distributed processes, in a pervasive computing environment, described as semantic web services through a new semantic annotation. In our approach, we map a process in a pervasive computing environment into a State Transition System and we semantically annotate it with a minimal set of o...


Automated Gene Ontology annotation for anonymous sequence data

Gene Ontology (GO) is the most widely accepted attempt to construct a unified and structured vocabulary for the description of genes and their products in any organism. Annotation by GO terms is performed in most of the current genome projects, which besides generality has the advantage of being very convenient for computer based classification methods. However, direct use of GO in small sequen...



Journal title:

Volume 2010, Issue

Pages -

Publication date: 2010